Exploring the Parallel Programming Design Space of Proximate, a Multi-Tile Programmable Accelerator

Authors

  • Vinay Gangadhar
  • Kai Zhao
Abstract

The slowing of Moore's law and Dennard scaling is limiting the performance improvements of single-core processors. Increasing clock frequency any further would lead to high leakage current and infeasible power consumption. Over the past decade, focus has shifted to multi-core processors, which increase throughput by using multiple cores to exploit different types of parallelism: instruction-level (ILP), data-level (DLP), and thread-level (TLP). However, even multi-core processors do not scale beyond a certain point because of Amdahl's law. To address the challenges of rising dark silicon and the end of Dennard scaling, architects have in recent years turned to heterogeneous architectures with special-purpose domain-specific accelerators (DSAs) for higher performance and energy efficiency. While they provide large benefits, DSAs are prone to obsolescence due to domain volatility, incur recurring design and verification costs, and occupy a large area when multiple DSAs are needed in a single device to accelerate different applications. To address these problems of DSAs and multi-core processors while retaining the programmability of general-purpose processors and the efficiency of DSAs, ongoing research in the Vertical Research Group at the University of Wisconsin-Madison aims to build a general-purpose multi-tile programmable accelerator called Proximate. This project explores the parallel programming design space of Proximate. The aim is to investigate the programming interface of Proximate and its parallel hardware improvements, and more generally to study the parallel programs that run on it. We examine this new type of multi-tile programmable architecture, its scalability, and its speedup on parallel programs compared with a traditional server-class multi-core processor.
In summary, we explored different programming design points in the project, and Proximate achieves speedups of 50-100x over a traditional server-class multi-core processor. We analyze this trend over the course of this document by explaining the architecture, programming model, and performance results of Proximate.
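
The scaling limit that the abstract attributes to Amdahl's law can be illustrated with a minimal sketch. The function name `amdahl_speedup` and the 95% parallel fraction are assumptions chosen for illustration; they are not figures or code from the Proximate project.

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the work scales with cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time on any core count; the parallel part divides evenly.
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the runtime is parallelizable.
    for cores in (1, 4, 16, 64, 1024):
        print(cores, round(amdahl_speedup(0.95, cores), 2))
```

Under this assumed 95% parallel fraction, the speedup can never exceed 1 / 0.05 = 20x regardless of core count, which is the kind of ceiling that motivates looking beyond conventional multi-core designs.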




Publication date: 2016